Exploring Topic-language Preferences in Multilingual Swahili Information Retrieval in Tanzania
Abstract
Habitual switching of languages is a common behaviour among polyglots when searching for information on the Web. Studies in information retrieval (IR) and multilingual information retrieval (MLIR) suggest that part of the reason for such regular switching is the topic of the search. Unlike survey-based studies, this study uses query and click-through logs. It exploits the querying and results selection behaviour of Swahili MLIR system users to explore how the topic of a search (query) is associated with language preferences, that is, topic-language preferences. This article is based on a carefully controlled study using Swahili-speaking Web users in Tanzania who interacted with a guided search engine. From the statistical analysis of the queries and click-through logs, it was revealed that language preferences may be associated with the topics of search. The results also suggest that language preferences are not static; they vary along the course of a search, from querying to results selection. In most topics, users either had no significant language preference or preferred Kiswahili when querying and changed their preference to English when selecting/clicking results. The findings might provide researchers with more insights for developing better MLIR systems that support certain types of scenarios.
Similar resources
Exploring Topic-based Language Models for Effective Web Information Retrieval
The main obstacle to providing focused search is the relative opaqueness of the search request: searchers tend to express their complex information needs in only a couple of keywords. Our overall aim is to find out if, and how, topic-based language models can lead to more effective web information retrieval. In this paper we explore retrieval performance of a topic-based model that combines topica...
Cross-Language Information Retrieval in a Multilingual Legal Domain
We describe here the application of a cross-language information retrieval technique based on similarity thesauri in the domain of Swiss law. We present the theory of similarity thesauri, which are information structures derived from corpora, and show how they can be used for cross-language retrieval. We also discuss the collections of Swiss legal documents and show how we have used them to co...
Experiments in Multilingual Information Retrieval
The multilingual information retrieval system of the future will need to be able to retrieve documents across language boundaries. This extension of the classical IR problem is particularly challenging, as significant resources are required to perform query translation. At Xerox, we are working to build a multilingual IR system and conducting a series of experiments to understand what factors ar...
Expanding a multilingual media monitoring and information extraction tool to a new language: Swahili
The Europe Media Monitor (EMM) family of applications is a set of multilingual tools that gather, cluster and classify news in currently fifty languages and that extract named entities and quotations (reported speech) from twenty languages. In this paper, we describe the recent effort of adding the African Bantu language Swahili to EMM. EMM is designed in an entirely modular way, allowing plugg...
Journal
Journal title: ACM Transactions on Asian and Low-Resource Language Information Processing
Year: 2021
ISSN: 2375-4699, 2375-4702
DOI: https://doi.org/10.1145/3458671